Does a high number of output labels affect the performance of BERT, and how do I handle class imbalance in multiclass text classification?

I am using BERT for multiclass text classification. There are 116 output classes to predict, and the data shows a high degree of class imbalance.
The number of records available per class looks like this:
Class A: 975 records
Class B: 776 records
Class C: 533 records
Class D: 412 records
Class E: 302 records
Class F: 250 records
Class G: 207 records
Class H: 137 records
Class I: 96 records
Class J: 51 records
Class K: 28 records
Class L: 17 records
Class M: 7 records
Class N: 2 records

So I have two questions:
Question 1: With around 116 output classes to predict, does the large number of classes by itself hurt BERT's performance?

Question 2: My original data has a class distribution similar to the one illustrated above. How does this imbalance affect BERT's performance, and if it does, how should I handle it to get good results?
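For Question 2, one commonly used starting point is inverse-frequency class weighting. The sketch below is only an illustration, not a recommendation from this thread: it uses the 14 counts listed above (the remaining classes are not shown in the post) and the "balanced" heuristic total / (n_classes * count). The function name `balanced_class_weights` is made up for this example.

```python
# Record counts copied from the post; only 14 of the ~116 classes are listed there.
counts = {
    "Class A": 975, "Class B": 776, "Class C": 533, "Class D": 412,
    "Class E": 302, "Class F": 250, "Class G": 207, "Class H": 137,
    "Class I": 96, "Class J": 51, "Class K": 28, "Class L": 17,
    "Class M": 7, "Class N": 2,
}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * count).

    Hypothetical helper for illustration; rare classes get large weights,
    frequent classes get weights below 1.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


weights = balanced_class_weights(counts)
for label, w in weights.items():
    print(f"{label}: {w:.3f}")
```

Weights like these could then be handed to a weighted cross-entropy loss, for example through the `weight` argument of `torch.nn.CrossEntropyLoss`. With classes that have only 2 or 7 records the weights become very large, so capping them may be worth trying; whether that helps here is an open question.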

Looking forward to answers from the talented community here.

Many thanks in advance.


@swagat1509 Were you able to solve this? I have the same scenario: around 106 classes and a highly imbalanced dataset, with about 23k records for some classes and only 2 records for others. I tried different models (distilbert-base-uncased, bert-base, deberta, roberta, bigbird) with different hyperparameter combinations and different loss functions such as focal loss and weighted loss, but I cannot get past 84% accuracy. Please reply if possible. If anyone else can help with this scenario, it would be greatly appreciated.
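A side note on the accuracy figure: with 23k records in some classes and 2 in others, plain accuracy is dominated by the large classes, so a class-averaged metric may show more clearly whether a change helps the rare classes. The sketch below uses hypothetical toy labels (not data from this thread) to compare accuracy with macro-averaged recall; both function names are illustrative.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of examples predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_recall(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Toy data (hypothetical): 95 "common" and 5 "rare" examples,
# and a model that always predicts "common".
y_true = ["common"] * 95 + ["rare"] * 5
y_pred = ["common"] * 100

print(accuracy(y_true, y_pred))      # 0.95
print(macro_recall(y_true, y_pred))  # 0.5
```

Macro-averaged recall is the same quantity scikit-learn reports as `balanced_accuracy_score` with its default settings, so that function could be used instead of the hand-written one.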


He seems to have gotten an answer on Stack Exchange. It doesn’t seem easy to improve performance…
https://datascience.stackexchange.com/questions/120215/does-high-number-of-output-labels-affect-the-performance-of-bert-and-how-to-hand